Analisis Silhouette Coefficient pada 6 Perhitungan Jarak K-Means Clustering (Silhouette Coefficient Analysis of 6 Distance Calculations in K-Means Clustering)

Authors

Abstract

Clustering is the process of grouping a set of data into clusters whose members are similar. Similarity is determined by a distance calculation. To assess the performance of several distance measures, the authors of this study test them on 6 [...] with differing attributes, namely 2, 3, 4, and [...] attributes. From the comparison of distance formulas for K-Means clustering using the Silhouette coefficient, it can be concluded that: 1) Chebyshev distance performs stably for both few and many [attributes]. 2) Average [distance] is the highest compared with the other measures [...] outliers such as 3. 3) Mean Character Difference obtains only [...]. 4) Euclidean distance, Manhattan [distance], and Minkowski [distance] produce [...] values for few attributes, whereas for many [attributes] they come fairly close to 0.5.
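The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pairwise`, `kmeans`, and `silhouette` are hypothetical, only three of the six distances (Euclidean, Manhattan, Chebyshev) are shown, the toy data is invented for illustration, and using the arithmetic mean as the centroid for every distance is a simplifying assumption.

```python
import numpy as np

def pairwise(X, Y, metric):
    """Distance from every row of X to every row of Y."""
    diff = np.abs(X[:, None, :] - Y[None, :, :])
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=2))
    if metric == "manhattan":
        return diff.sum(axis=2)
    if metric == "chebyshev":
        return diff.max(axis=2)
    raise ValueError(f"unknown metric: {metric}")

def kmeans(X, k, metric, n_iter=50, seed=0):
    """Lloyd-style K-Means; assignment uses `metric`, update uses the mean (assumption)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = pairwise(X, centroids, metric).argmin(axis=1)
        # Keep the old centroid if a cluster happens to become empty.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return pairwise(X, centroids, metric).argmin(axis=1)

def silhouette(X, labels, metric):
    """Mean silhouette coefficient; nan if fewer than two clusters are present."""
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return float("nan")
    D = pairwise(X, X, metric)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue  # singleton cluster: s(i) = 0 by convention
        a = D[i, own].sum() / (own.sum() - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return float(scores.mean())

# Toy data (invented): two small groups in 2 attributes.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
for metric in ("euclidean", "manhattan", "chebyshev"):
    labels = kmeans(X, k=2, metric=metric)
    print(metric, silhouette(X, labels, metric))
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest misassigned points. Repeating the loop on datasets with different numbers of attributes would mirror the kind of comparison the abstract reports.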

Similar resources

Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm

Identifying clusters or clustering is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group/cluster are more similar in some sense or another. It is a main task of exploratory data mining, and a common technique for statistical data analysis. This paper proposes an improved version of the K-Means algorithm, namely Persistent K...

Adaptive K-Means Clustering

Clustering is used to organize data for efficient retrieval. One of the problems in clustering is the identification of clusters in given data. A popular technique for clustering is based on K-means such that the data is partitioned into K clusters. In this method, the number of clusters is predefined and the technique is highly dependent on the initial identification of elements that represent...

Constrained K-Means Clustering

We consider practical methods for adding constraints to the K-Means clustering algorithm in order to avoid local solutions with empty clusters or clusters having very few points. We often observe this phenomenon when applying K-Means to datasets where the number of dimensions is n ≥ 10 and the number of desired clusters is k ≥ 20. We propose explicitly adding k constraints to the underlying clusteri...

An Analysis of the Application of Simplified Silhouette to the Evaluation of k-means Clustering Validity

Silhouette is one of the most popular and effective internal measures for the evaluation of clustering validity. Simplified Silhouette is a computationally simplified version of Silhouette. However, to date Simplified Silhouette has not been systematically analysed in a specific clustering algorithm. This paper analyses the application of Simplified Silhouette to the evaluation of k-means clust...

Subspace K-means clustering.

To achieve an insightful clustering of multivariate data, we propose subspace K-means. Its central idea is to model the centroids and cluster residuals in reduced spaces, which allows for dealing with a wide range of cluster types and yields rich interpretations of the clusters. We review the existing related clustering methods, including deterministic, stochastic, and unsupervised learning app...

Journal

Journal title: Techno.COM Jurnal

Year: 2021

ISSN: 2356-2579, 1412-2693

DOI: https://doi.org/10.33633/tc.v20i2.4556